[CuTe] Add missing include for smem_ptr_flag_bits in print_tensor.hpp #3244

Open
LeSingh1 wants to merge 1 commit into NVIDIA:main from LeSingh1:fix/print-tensor-missing-include
@LeSingh1

Summary

cute/util/print_tensor.hpp references smem_ptr_flag_bits<B> (line 92, in the print_layout overload for ComposedLayout), but its existing includes (config.hpp, layout.hpp, tensor_impl.hpp) do not transitively pull in cute/pointer_flagged.hpp, which is the header that defines smem_ptr_flag_bits.

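In typical usage the header is reached via cute/tensor.hpp, which already includes pointer_flagged.hpp, so the bug is hidden. It surfaces when print_tensor.hpp is included directly: clangd and isolated translation units that include only this header report "use of undeclared identifier 'smem_ptr_flag_bits'".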

This patch adds the missing include so the header is self-contained.
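
To illustrate why the missing declaration breaks the header even when the overload is never called, here is a minimal sketch using simplified stand-ins. None of these definitions are the real CuTe ones: the struct bodies, the `flagged_bits` helper, and the template parameter names are hypothetical and only mimic the shape of the print_layout overload described above.

```cpp
// Simplified stand-ins for illustration only, NOT the real CuTe definitions.

// Stand-in for what cute/pointer_flagged.hpp provides.
template <int B>
struct smem_ptr_flag_bits {
  static constexpr int bits = B;
};

// Stand-in for the ComposedLayout template from cute/layout.hpp.
template <class A, class Offset, class B>
struct ComposedLayout {};

// Stand-in for the overload in print_tensor.hpp (hypothetical helper name).
// The parameter type names smem_ptr_flag_bits<B>, so the compiler must already
// know that smem_ptr_flag_bits is a template when it parses this declaration,
// whether or not the function is ever instantiated. Removing the stand-in
// definition above makes this declaration fail to compile.
template <class SwizzleFn, int B, class Layout>
constexpr int flagged_bits(
    ComposedLayout<SwizzleFn, smem_ptr_flag_bits<B>, Layout> const&) {
  return B;
}
```

Because the name is needed at parse time, a header that uses it has to include the declaring header itself instead of relying on whichever umbrella header the user happened to include first.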

Fix

 #include <cute/layout.hpp>
+#include <cute/pointer_flagged.hpp>  // cute::smem_ptr_flag_bits
 #include <cute/tensor_impl.hpp>

Mirrors the include style already used in cute/atom/mma_traits_sm90_gmma.hpp (#include <cute/pointer_flagged.hpp> // cute::smem_ptr_flag).

Verification

  • Confirmed by inspection that none of config.hpp, layout.hpp, or tensor_impl.hpp transitively includes pointer_flagged.hpp.
  • grep -rn "pointer_flagged" include/cute/ shows only tensor.hpp and mma_traits_sm90_gmma.hpp pulling it in elsewhere.
  • I do not have an NVCC toolchain on this machine to build the full example suite; if CI catches anything unexpected I'll iterate.

Issue

Fixes #3205

cute/util/print_tensor.hpp references smem_ptr_flag_bits<B> at line 92
in its print_layout overload for ComposedLayout, but only includes
config.hpp, layout.hpp, and tensor_impl.hpp -- none of which transitively
pull in cute/pointer_flagged.hpp where smem_ptr_flag_bits is defined.

In practice users typically reach this header via cute/tensor.hpp (which
already pulls in pointer_flagged.hpp), so the bug only surfaces when the
header is included directly. clangd and isolated translation units that
include only print_tensor.hpp report "use of undeclared identifier
'smem_ptr_flag_bits'".

Fixes NVIDIA#3205
Development

Successfully merging this pull request may close these issues.

[BUG] [CuTe] print_tensor.hpp missing include for cute/pointer_flagged.hpp

1 participant